Tiling as a Durable Abstraction for Parallelism and Data Locality
Authors
Abstract
Tiling is a useful loop transformation for expressing parallelism and data locality. Automated tiling transformations that preserve data locality are increasingly important because of hardware trends towards massive parallelism and the increasing cost of data movement relative to the cost of computing. We propose TiDA as a durable tiling abstraction that centralizes parameterized tiling information within array data types with minimal changes to the source code. The data layout information can be used by the compiler and runtime to automatically manage parallelism, optimize data locality, and schedule tasks intelligently. In this paper, we present the design features and early interface of TiDA along with some preliminary results.
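To make the idea of keeping tiling parameters inside the array type concrete, here is a minimal sketch in Python. It is not TiDA's interface: the class `TiledGrid`, its `tiles()` method, and the `scale` function are hypothetical names chosen for illustration. The sketch assumes a dense 2-D array and rectangular tiles.

```python
from typing import Iterator, List, Tuple


class TiledGrid:
    """Hypothetical 2-D array that carries its own tile shape (not TiDA's API)."""

    def __init__(self, rows: int, cols: int, tile_rows: int, tile_cols: int) -> None:
        self.rows, self.cols = rows, cols
        self.tile_rows, self.tile_cols = tile_rows, tile_cols
        self.data: List[List[float]] = [[0.0] * cols for _ in range(rows)]

    def tiles(self) -> Iterator[Tuple[int, int, int, int]]:
        """Yield half-open bounds (row_lo, row_hi, col_lo, col_hi) for each tile."""
        for r in range(0, self.rows, self.tile_rows):
            for c in range(0, self.cols, self.tile_cols):
                # Edge tiles are clipped to the array bounds.
                yield (r, min(r + self.tile_rows, self.rows),
                       c, min(c + self.tile_cols, self.cols))


def scale(grid: TiledGrid, factor: float) -> None:
    """Multiply every element by factor, visiting the array one tile at a time."""
    for r_lo, r_hi, c_lo, c_hi in grid.tiles():
        # Each tile touches a compact block of data and is independent of the
        # others, so a runtime could hand tiles to different workers.
        for i in range(r_lo, r_hi):
            for j in range(c_lo, c_hi):
                grid.data[i][j] *= factor
```

Because the tile shape lives in the array object, changing it (for example from 2x3 to 64x64) requires no change to `scale`; only the constructor arguments differ. That separation is the property the abstract describes, shown here under the assumptions stated above.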
Similar resources
Combining Performance Aspects of Irregular Gauss-Seidel Via Sparse Tiling
Finite element problems are often solved using multigrid techniques. The most time-consuming part of multigrid is the iterative smoother, such as Gauss-Seidel. To improve performance, iterative smoothers can exploit parallelism, intra-iteration data reuse, and inter-iteration data reuse. Current methods for parallelizing Gauss-Seidel on irregular grids, such as multi-coloring and owner-computes ...
Automated Tiling of Unstructured Mesh Computations with Application to Seismological Modelling
Sparse tiling is a technique to fuse loops that access common data, thus increasing data locality. Unlike traditional loop fusion or blocking, the loops may have different iteration spaces and access shared datasets through indirect memory accesses, such as A[map[i]], hence the name "sparse". One notable example of such loops arises in discontinuous-Galerkin finite element methods, because of the...
2-D Wavelet Transform Enhancement on General-Purpose Microprocessors: Memory Hierarchy and SIMD Parallelism Exploitation
This paper addresses the implementation of a 2-D Discrete Wavelet Transform on general-purpose microprocessors, focusing on both memory hierarchy and SIMD parallelization issues. Both topics are somewhat related, since SIMD extensions are only useful if the memory hierarchy is efficiently exploited. In this work, locality has been significantly improved by means of a novel approach called pipel...
HPVM: A Portable Virtual Instruction Set for Heterogeneous Parallel Systems
We describe a programming abstraction for heterogeneous parallel hardware, designed to capture a wide range of popular parallel hardware, including GPUs, vector instruction sets and multicore CPUs. Our abstraction, which we call HPVM, is a hierarchical dataflow graph with shared memory and vector instructions. We use HPVM to define both a virtual instruction set (ISA) and also a compiler inter...
The Deleterious Nature of Interacting Tiling Optimizations
A compiler may perform multiple optimizations, each with its own goal and cost function. While it is acknowledged that optimizations can interact, in practice the interactions are often ignored, and assumed to have no deleterious effects. In this paper, we demonstrate for optimizations involving tiling that the interactions have unexpectedly harmful effects on overall performance. Current trends ...
Publication date: 2013